Section I



INTRODUCTORY REMARKS 



                              Paragraph
Introduction                      1
Purpose                           2
Arrangement of contents           3



1. Introduction. — a. An examination of either plain text or cryptographic text will convince
the reader that the occurrences of the various textual elements do not follow a definite, rigorous
mathematical law.

b. In the solution of a cryptogram the cryptanalyst deals almost exclusively with uncertainties
as regards the relationships of its textual elements. Accordingly, he is concerned with
the question: What is the probability of a certain event? Of course, there are certain causative
or controlling factors which determine whether or not the event takes place, and with sufficient
information the answer to the question would be either “It is certain to occur” or “It is
certain not to occur.”

c. The mathematical theory of probability and statistics is accordingly of importance to
the cryptanalyst since it provides a means for the quantitative analysis of the uncertainties
with which he deals. It also provides a means whereby he may study the behavior of groups
of symbols and draw conclusions therefrom.

d. It is not very often that statistical analysis alone will enable the cryptanalyst to arrive 
at the solution of a cryptogram. Statistical analysis will, however, enable the cryptanalyst to 
evaluate the desirability of pursuing certain procedures and will indicate the most likely order in 
which to try various possible steps in solution. 

e. Of fundamental importance in the application of statistical technique to cryptography 
are the various frequency tables relating to the characteristic frequencies of textual elements of 
different languages. A number of such tables will be found in section VIII. 

f. It must be emphasized here that the methods and procedures to be discussed herein are
a means to an end, and not an end in themselves.

2. Purpose. — This book has been prepared to provide cryptanalysts with an introduction
to certain concepts and methods of the mathematical theory of statistics which are useful in
cryptanalysis, and to provide the reader with certain formulas, charts, and tables which have
been found to be of assistance in the solution of a variety of cryptanalytic problems.

3. Arrangement of contents. — a. The book is divided into two parts. In the first part,
there are: (1) An exposition of the underlying theory; (2) A presentation of many useful formulas;
(3) Procedures for the use of these formulas in the solution of problems; (4) Illustrations and
examples.

b. In the second part are charts and tables which will assist in the application of the methods 
discussed in part 1, and a number of appendixes presenting the mathematical development of 
formulas presented in the first part. There is also a summary of all the formulas and definitions 
found throughout the book. 

c. In keeping with the purpose as set forth in paragraph 2, no attempt has been made in 
the exposition of part 1 to present the mathematical analysis underlying the derivation of the 
formulas discussed. 






PART 1 
Section II 



GENERAL CONSIDERATIONS OF PROBABILITY 

                                   Paragraph
A priori probability                   4
Statistical probability                5
Combinations of probabilities          6

4. A priori probability. — a. A complete discussion of the mathematical and philosophical 
implications involved in a logically rigorous approach to mathematical probability is beyond the 
purpose of this book. Herein it will suffice to use the following definition of a priori probability: 

The probability that an event will occur is the ratio of the number of favorable cases to the number 
of total possible cases, all cases being equally likely to occur. By a favorable case, is meant one 
which will produce the event in question. 

b. The probability for the occurrence of an event is always a non-negative fraction not exceeding 1.
The numbers “1” and “0” are taken to represent certainty, since in those circumstances every
case is either favorable or not favorable and will produce the event in question or will not
produce the event in question. If the probability that an event will occur is p and the probability
that it will not occur is q, then $p + q = 1$. (It is certain that the event either will or will not
occur.)

c. In cryptography the probability of occurrence of each of the letters of the alphabet in 
various languages is of interest. It is obviously impossible to apply the preceding definition of 
a priori probability, since that would involve a study of every conceivable message that might 
be sent. In this case, which illustrates the situation most frequently encountered in practical 
statistical work, there must be introduced the concept of statistical probability. 

5. Statistical probability. — a. The fundamental basis in statistical probability is the fact
that, for all practical purposes, the difference between the unknown a priori probability and the
ratio of observed favorable cases to the observed total number of cases can be made as small as
we please by indefinitely increasing the total number of observed cases.¹ The limit of the ratio
of the number of observed favorable cases to the total number of observed cases, as the latter
number increases indefinitely, shall be called the probability that the event occurs.¹

b. Thus, in order to find the probabilities of occurrence for each of the letters of the alphabet,
it is necessary to examine a large amount of text. A study of 100,000 letters of English
telegraphic text gave the result shown in figure 1. We thus find that the probability for the
occurrence of A is 0.07189; for B it is 0.01146; for C it is 0.03345, etc.

c. It is usual to denote the numbers 7,189, 1,146, 3,345, etc. (i. e., the number of observed 
favorable cases) as the absolute frequencies, and the numbers 0.07189, 0.01146, 0.03345, etc. (the 
ratio of the number of observed favorable cases to the total number of observed cases) as the 
relative frequencies. 
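
The distinction between absolute and relative frequencies can be sketched in a few lines of Python. This is only an illustration: the function name `letter_frequencies` and the sample message are assumptions introduced here, and a sample this short is far too small to estimate the probabilities of figure 1.

```python
from collections import Counter

def letter_frequencies(text):
    """Return (absolute, relative) frequencies of the letters A-Z in text."""
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    absolute = Counter(letters)              # number of observed favorable cases
    total = len(letters)                     # total number of observed cases
    relative = {letter: count / total for letter, count in absolute.items()}
    return absolute, relative

# Hypothetical sample message, used only to exercise the function.
sample = "SEND SUPPLIES TO THE FORWARD AREA AT ONCE"
absolute, relative = letter_frequencies(sample)
print(absolute["E"], round(relative["E"], 4))
```

As the amount of text examined grows, the relative frequencies would be expected to settle toward the statistical probabilities described in paragraph 5a.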



1 See appendix A, p. 148.

Figure 1.



6. Combinations of probabilities. — a. If an event under investigation is one of several
mutually exclusive events, then the probability that it occurs is the sum of the probabilities of
occurrence of each of the mutually exclusive events.

Example 1. — What is the probability that any one letter chosen at random from English
telegraphic text is a vowel? Since the event in question is one of the mutually exclusive events
“finding A, E, I, O, U, Y,” the probability sought is $P_v = P_A + P_E + P_I + P_O + P_U + P_Y$, where
$P_v$, $P_A$, $P_E$, $P_I$, $P_O$, $P_U$, $P_Y$, respectively, mean the probability for the occurrence of a vowel, the
probability for the occurrence of A, etc. Adding the component probabilities, as found from
figure 1, there results $P_v = 0.39865$. It may be seen from this that approximately 40 percent of
the letters of English telegraphic text are vowels.

b. If the event under study is the simultaneous occurrence of several events, or the successive
occurrence of several events, then the probability that it will occur is the product of the
probabilities of occurrence of the component events, provided the occurrence of one does not
affect the occurrence of the others; or, as we shall say, provided the events are independent.
Thus, the probability that two letters selected at random from English telegraphic text are
both vowels is 0.4 × 0.4 = 0.16.
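
The two rules of paragraph 6 can be sketched as follows. The helper names `prob_any` and `prob_all_independent` are assumptions made for illustration; the only figure 1 values used are the three quoted in paragraph 5b, since the rest of the table is not reproduced here.

```python
def prob_any(relative, letters):
    """Sum rule: probability that a letter is any one of several mutually exclusive letters."""
    return sum(relative.get(letter, 0.0) for letter in letters)

def prob_all_independent(probabilities):
    """Product rule: probability that every one of several independent events occurs."""
    result = 1.0
    for p in probabilities:
        result *= p
    return result

# Only A, B, and C are quoted from figure 1 in the text; the other letters are omitted.
figure_1_excerpt = {"A": 0.07189, "B": 0.01146, "C": 0.03345}

p_a_b_or_c = prob_any(figure_1_excerpt, "ABC")    # a letter is A, B, or C
p_two_vowels = prob_all_independent([0.4, 0.4])   # rounded vowel probability from the text
print(round(p_a_b_or_c, 5), round(p_two_vowels, 2))
```

The product rule holds only under the independence proviso stated above; successive letters of real text are not strictly independent, so 0.16 is best read as a first approximation.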




Section III 
STATISTICS 




Paragraph 

Definitions — 7 

7. Definitions. — a. By statistical method we mean the mathematical treatment of observa- 
tional data in accordance with the fundamental laws of probability discussed in the preceding 
section. 

b. By a statistical variate we mean a variable which may assume a finite or infinite number 
of different values in accordance with a certain law of probability. The sum of the probabilities 
corresponding to each of the different values must be one. 

Example 2. — The variable θ, where θ is to represent any letter of the alphabet, is a statistical
variate since θ will assume the values A, B, C, . . . , Z with probabilities corresponding
to the values in figure 1.

c. In order to be able to study efficiently a mass of data, it is desirable that we be able to compute 
several numbers which will, to a certain extent, characterize the data and display its important 
properties. 

d. By a statistic we mean any number computed from observed data in accordance with
certain rules. The following are some of the more common statistics which are used to characterize
a mass of data and which there will be occasion to use in the course of this work.

e. (1) The arithmetic mean or average of a sequence of numbers is the sum of the numbers 
divided by the number of items. 

Example 3. — What is the average of 1, 2, 3, 4, 5? The average is (1+2+3+4+5)/5 = 3.

(2) The weighted mean or average of a series of numbers is the sum of the product of each
number and its weight, divided by the sum of the weights. In general, in the study of observed
data, the weight corresponds to the number of observed occurrences; in theoretical discussions,
it corresponds to the probability of occurrence. It is usual to omit the adjective “weighted”
since this definition reduces to the one first given when every number has the same weight.

(3) Symbolically we may express the foregoing as follows: If the numbers $x_1, x_2, \ldots, x_n$
have, respectively, the weights $w_1, w_2, \ldots, w_n$ (or occur respectively $w_1, w_2, \ldots, w_n$ times),
then the average of $x_1, x_2, \ldots, x_n$, or symbolically $\bar{x}$ (read x bar), is given by

$$\bar{x} = \frac{w_1 x_1 + w_2 x_2 + \cdots + w_n x_n}{w_1 + w_2 + \cdots + w_n}$$

Example 4 . — A study of 100 sets of English text, each of 50 letters, yielded the following as 
the number of occurrences of the letter A per set. 

    x_i     w_i

     1        3
     2       26
     3       21
     4       19
     5       15
     6        8
     7        7
     8        1
            ---
            100

(i. e., A occurred once in each of 3 sets; twice in each of 26 sets; three times in each of 21 sets,
etc.). The average observed occurrence of A per set of 50 letters is therefore

$$\bar{x} = \frac{(3 \times 1) + (26 \times 2) + (21 \times 3) + (19 \times 4) + (15 \times 5) + (8 \times 6) + (7 \times 7) + (1 \times 8)}{3 + 26 + 21 + 19 + 15 + 8 + 7 + 1} = \frac{374}{100} = 3.74$$

(4) If x is a statistical variate, i. e., if x takes on the values $x_1, x_2, \ldots, x_n$ with the corresponding
probabilities $p_1, p_2, \ldots, p_n$, respectively, then the average value of x is
$\bar{x} = p_1 x_1 + p_2 x_2 + \cdots + p_n x_n$. (In this case the total weight $p_1 + p_2 + \cdots + p_n = 1$.)

f. The mean square of a series of numbers is the average of the squares of the numbers.
Symbolically, if $x_1, x_2, \ldots, x_n$ is a sequence of numbers with corresponding weights
$w_1, w_2, \ldots, w_n$, respectively, then

$$\text{mean square } x = \frac{w_1 x_1^2 + w_2 x_2^2 + \cdots + w_n x_n^2}{w_1 + w_2 + \cdots + w_n} = f_1 x_1^2 + f_2 x_2^2 + \cdots + f_n x_n^2$$

where $f_i = w_i/(w_1 + w_2 + \cdots + w_n)$ $(i = 1, 2, \ldots, n)$.²

In the foregoing $w_i$ $(i = 1, 2, \ldots, n)$ is an absolute weight and $f_i$ $(i = 1, 2, \ldots, n)$ is a relative
weight.

g. Let $x_1, x_2, \ldots, x_n$ be a sequence of numbers whose mean value is $\bar{x}$. The deviation of
$x_i$ from the mean is $x_i - \bar{x}$. The deviation will be negative, zero, or positive according as
$x_i$ is less than, equal to, or greater than $\bar{x}$.

h. The variance of a sequence of numbers is the mean square of the deviations from the
mean, i. e.,

$$\text{variance} = v = \frac{w_1 (x_1 - \bar{x})^2 + w_2 (x_2 - \bar{x})^2 + \cdots + w_n (x_n - \bar{x})^2}{w_1 + w_2 + \cdots + w_n} = f_1 (x_1 - \bar{x})^2 + f_2 (x_2 - \bar{x})^2 + \cdots + f_n (x_n - \bar{x})^2$$

where the x’s, w’s, and f’s are defined as above.

The positive square root of the variance is called the standard deviation.

It may be shown that $v = f_1 x_1^2 + f_2 x_2^2 + \cdots + f_n x_n^2 - (\bar{x})^2$ = (mean square of x) − (square of
the mean of x).
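
The definitions of paragraphs 7e to 7h can be checked numerically on the data of example 4. The following is a minimal sketch; the function names are assumptions chosen for illustration.

```python
from math import sqrt

def weighted_mean(values, weights):
    """Sum of each value times its weight, divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

def mean_square(values, weights):
    """Weighted average of the squares of the values."""
    return weighted_mean([x * x for x in values], weights)

def variance(values, weights):
    """Weighted mean square of the deviations from the mean."""
    mean = weighted_mean(values, weights)
    return weighted_mean([(x - mean) ** 2 for x in values], weights)

# Example 4: occurrences of A per set of 50 letters, and the number of sets showing each.
occurrences = [1, 2, 3, 4, 5, 6, 7, 8]
sets = [3, 26, 21, 19, 15, 8, 7, 1]

mean = weighted_mean(occurrences, sets)
msq = mean_square(occurrences, sets)
var = variance(occurrences, sets)
print(mean, msq, round(var, 4), round(sqrt(var), 2))
```

The variance computed directly from the deviations agrees with (mean square) − (square of the mean), which is usually the more convenient form for hand calculation.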



2 The notation $(i = 1, 2, \ldots, n)$ is a convenient way of indicating that i is to be replaced by all of the
successive values 1, 2, 3, . . . , n, in turn.








i. In general, the average of a sequence of numbers is a central value about which the numbers
tend to cluster; the variance is a measure of the variation about this central value.³

3 The weighted sum of the deviations from the mean is not a suitable measure of the variation because it is
in all cases equal to zero. The following simple algebra demonstrates this fact:

$$\frac{w_1 (x_1 - \bar{x}) + w_2 (x_2 - \bar{x}) + \cdots + w_n (x_n - \bar{x})}{w_1 + w_2 + \cdots + w_n} = \frac{w_1 x_1 + w_2 x_2 + \cdots + w_n x_n}{w_1 + w_2 + \cdots + w_n} - \frac{\bar{x}\,(w_1 + w_2 + \cdots + w_n)}{w_1 + w_2 + \cdots + w_n} = \bar{x} - \bar{x} = 0$$

The next simplest possible measure of the variation about the mean is the weighted sum of the absolute values
of the deviations (the weighted sum of the arithmetical values of the deviations neglecting the sign). Symbolically
this would be written as

$$\frac{w_1 |x_1 - \bar{x}| + w_2 |x_2 - \bar{x}| + \cdots + w_n |x_n - \bar{x}|}{w_1 + w_2 + \cdots + w_n}$$

However, because of the fact that the variance is more amenable to mathematical treatment and because 
of its relationship with the theory of least squares and the normal probability distribution the variance rather 
than the weighted sum of the absolute values of the deviations is the more commonly used measure of variation. 
Section IV

FREQUENCY DISTRIBUTIONS 



                                   Paragraph
Generalities                           8
Binomial distribution                  9
Normal distribution                   10
Poisson distribution                  11
Modified Poisson distribution         12
Multinomial distribution              13



8. Generalities. — a. Some slight experience in cryptanalysis will soon convince one that an
outstanding characteristic of the data studied is its variation. The data which are the object of
statistical study always display variation in one or more respects.

b. The notion of a collection of data arranged in a frequency distribution with respect to one
or more characteristics is fundamental in statistical work. If n observations originating from
the same set of circumstances are made with respect to a statistical variate, and if the individual
observations are arranged with respect to their magnitude, the result is said to form a frequency
distribution; to each value of the variate, there corresponds an absolute frequency. In example 4
there is a frequency distribution of 100 observations of the number of occurrences of the letter A
per set of 50 letters of English telegraphic text. Subsequent discussion in this section will introduce
theoretical frequency distributions in which to each value of the variate will correspond a
probability instead of a definite number of occurrences.

c. Frequency distributions may be discontinuous or continuous. In discontinuous distributions
the statistical variate may assume a finite or infinite number of discontinuous values
(values which are separated one from the other by finite quantities). The distribution of the
number of occurrences of the letter A per set of 50 letters given in example 4 (page 5) is an
illustration of a discontinuous distribution in which the statistical variate (the number of occurrences
of the letter A per set) takes on a finite number of values. In continuous distributions the
statistical variate may assume all possible values within its range of variation. In the latter case
the frequency distribution may be expressed by stating the proportion of the data for which the
variate is less than a given value or the proportion of the data for which the variate lies between
given values.

d. It is presumed that the reader is already acquainted with instances of frequency distri- 
butions, e. g., the frequency distribution of single letters, digraphs, etc., of cryptograms. 

e. The following is a frequency distribution of the lengths of words in a series of official 
telegrams; in all 10,000 words were studied. 



Number of letters     Number of     Number of
    per word            words        letters
      x_i                F_i         x_i F_i

       1                 390            390
       2               1,028          2,056
       3               1,369          4,107
       4               1,745          6,980
       5               1,457          7,285
       6               1,169          7,014
       7               1,039          7,273
       8                 735          5,880
       9                 479          4,311
      10                 288          2,880
      11                 163          1,793
      12                  86          1,032
      13                  25            325
      14                  23            322
      15                   4             60
                      ------         ------
                      10,000         51,708






















From this it is seen that the average number of letters per word of English telegraphic text is 
5.17. For most purposes, assuming this value to be 5 will give a sufficiently accurate approxi- 
mation. (This is one of the reasons why the arbitrary length of five characters per word has 
been adopted as standard for code or cipher text.) 
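
The average of 5.17 letters per word can be recovered from the table with a short calculation. This is a sketch under one assumption: the word lengths are taken to run from 1 to 15 in table order, so that the number of words of each length is the number of letters divided by the length.

```python
# Number of letters contributed by words of each length (third column of the table).
# Assumption: word lengths run from 1 to 15 in table order.
letters_by_length = {
    1: 390, 2: 2056, 3: 4107, 4: 6980, 5: 7285,
    6: 7014, 7: 7273, 8: 5880, 9: 4311, 10: 2880,
    11: 1793, 12: 1032, 13: 325, 14: 322, 15: 60,
}

# Number of words of each length = letters contributed / letters per word.
words_by_length = {length: letters // length
                   for length, letters in letters_by_length.items()}

total_words = sum(words_by_length.values())
total_letters = sum(letters_by_length.values())
mean_length = total_letters / total_words   # weighted mean of the word lengths
print(total_words, total_letters, round(mean_length, 2))
```

That the word counts obtained this way add up to the stated 10,000 is a useful check on the assumed word lengths.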

f. It is very desirable to be able to characterize by means of a mathematical formula the
relationship between the various values that a statistical variate may take, and the corresponding
probabilities (or frequencies). Such a formulation simplifies the study of frequency distributions
and enables valid judgments about sample distributions to be formed. The study of the
possible formulas for frequency distributions has yielded a number of important results.

g. We shall here restrict ourselves to five types of frequency distributions which are of 
primary importance in cryptography, viz, the binomial distribution, the normal distribution, the 
Poisson distribution, the modified Poisson distribution, and the multinomial distribution. 

9. Binomial distribution.⁴ — a. The binomial distribution is the first example of a theoretical
distribution to be established, and was discovered by Jacob Bernoulli about the end of
the seventeenth century. It can be shown that if the probability that an event occurs is p,
and the probability that it does not occur is q (q = 1 − p), then, if n independent observations
are made, the probability that the event occurs exactly 0, 1, 2, . . . , n times is given by the
respective term of the expansion of the binomial

$$(q + p)^n = q^n + n q^{n-1} p + \frac{n(n-1)}{1 \times 2}\, q^{n-2} p^2 + \frac{n(n-1)(n-2)}{1 \times 2 \times 3}\, q^{n-3} p^3 + \cdots + p^n \tag{9.1}$$



Thus, the probability that the event occurs 0 times in n trials is $P_0 = q^n$; the probability
that the event occurs exactly one time in n trials is $P_1 = n q^{n-1} p$; the probability that the event
occurs exactly two times in n trials is $P_2 = \frac{n(n-1)}{1 \times 2}\, q^{n-2} p^2$; the probability that the event
occurs exactly x times (x an integer) in n trials ($x \le n$) is

$$P_x = \frac{n(n-1)(n-2) \cdots (n-x+1)}{1 \times 2 \times 3 \times \cdots \times x}\, q^{n-x} p^x = \frac{n!}{x!\,(n-x)!}\, q^{n-x} p^x$$

where x! (read x factorial) is equal to x(x−1)(x−2) . . . 1.

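Example 5. — Using 0.1 as the probability for the occurrence of T in English text, what is
the probability that T occurs zero times, exactly one time, exactly two times, . . . , exactly
eight times in a set of 100 letters of English text? In this case p = 0.1, q = 0.9, n = 100, so that
the desired probabilities are:

that T occurs zero times: $(0.9)^{100} = 0.0000 = P_0$

that T occurs exactly one time: $100\,(0.9)^{99}(0.1) = 0.0003 = P_1$

that T occurs exactly two times: $\frac{100 \times 99}{1 \times 2}\,(0.9)^{98}(0.1)^2 = 0.0016 = P_2$

that T occurs exactly three times: $\frac{100 \times 99 \times 98}{1 \times 2 \times 3}\,(0.9)^{97}(0.1)^3 = 0.0059 = P_3$

that T occurs exactly four times: $\frac{100 \times 99 \times 98 \times 97}{1 \times 2 \times 3 \times 4}\,(0.9)^{96}(0.1)^4 = 0.0159 = P_4$

that T occurs exactly five times: $\frac{100 \times 99 \times 98 \times 97 \times 96}{1 \times 2 \times 3 \times 4 \times 5}\,(0.9)^{95}(0.1)^5 = 0.0339 = P_5$

that T occurs exactly six times: $\frac{100 \times 99 \times 98 \times 97 \times 96 \times 95}{1 \times 2 \times 3 \times 4 \times 5 \times 6}\,(0.9)^{94}(0.1)^6 = 0.0596 = P_6$

that T occurs exactly seven times: $\frac{100 \times 99 \times 98 \times 97 \times 96 \times 95 \times 94}{1 \times 2 \times 3 \times 4 \times 5 \times 6 \times 7}\,(0.9)^{93}(0.1)^7 = 0.0889 = P_7$

that T occurs exactly eight times: $\frac{100 \times 99 \times 98 \times 97 \times 96 \times 95 \times 94 \times 93}{1 \times 2 \times 3 \times 4 \times 5 \times 6 \times 7 \times 8}\,(0.9)^{92}(0.1)^8 = 0.1148 = P_8$

4 See appendix A, p. 148 ff.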
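
The terms of example 5 are tedious to work by hand but easy to check by machine. The following sketch evaluates the general term $P_x$ of (9.1); the function name `binomial_pmf` is an assumption made here, and `math.comb` (Python 3.8 or later) supplies the coefficient n!/(x!(n−x)!).

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability that an event of probability p occurs exactly x times in n trials."""
    q = 1.0 - p
    return comb(n, x) * q ** (n - x) * p ** x

# Example 5: T with probability 0.1 in a set of 100 letters.
for x in range(9):
    print(x, f"{binomial_pmf(x, 100, 0.1):.4f}")
```

Note that $P_0$ prints as 0.0000 only because of rounding to four places; its value is about 0.00003, not zero.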

b. To find the probability that an event, whose possible occurrences are distributed in accordance
with the foregoing distribution, occurs at least r times, it is merely necessary to add the
probabilities that the event occurs exactly r, r+1, r+2, . . . , n times. If then we use P(r) to
represent the probability for at least r occurrences we have

$$P(r) = \sum_{x=r}^{n} \frac{n!}{x!\,(n-x)!}\, q^{n-x} p^x = \sum_{x=r}^{n} P_x = 1 - \sum_{x=0}^{r-1} P_x$$

(The symbol $\sum_{x=r}^{n}$ means the sum of the terms for all integral values of x from r to n inclusive.)

Example 6. — Using 0.1 as the probability for the occurrence of T in English text, what is
the probability that T occurs at least six times in a set of 100 letters? In order to find the desired
probability it is necessary to subtract from 1 the sum of the probabilities that T occurs exactly
0, 1, . . . , 5 times. Using the values found in example 5, we have

P(6) = 1 − (0.0000 + 0.0003 + 0.0016 + 0.0059 + 0.0159 + 0.0339)

     = 1 − 0.0576 = 0.9424
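
The complement form of P(r) used in example 6 can be sketched as below. The function name `at_least` is an assumption introduced for illustration.

```python
from math import comb

def at_least(r, n, p):
    """P(r): probability of at least r occurrences in n trials, as 1 - (P_0 + ... + P_{r-1})."""
    q = 1.0 - p
    below = sum(comb(n, x) * q ** (n - x) * p ** x for x in range(r))
    return 1.0 - below

# Example 6: at least six occurrences of T in 100 letters.
p_six_or_more = at_least(6, 100, 0.1)
print(round(p_six_or_more, 4))
```

Summing the few terms below r and subtracting from 1 is preferable to summing the n − r + 1 terms from r upward whenever r is small compared with n.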

c. For a statistical variate which takes on its possible values in accordance with the law of
distribution given by the binomial distribution, it may be shown that the mean value $= \mu = np$,
the mean square $= \mu_2 = n^2 p^2 + npq$, and the variance $= \sigma^2 = npq$. (See appendix A, p. 148 ff.)

Example 7. — Let us take as the probability for the occurrence of A in English text p = 0.072.
Then, the theoretical average value for the number of occurrences of A in a set of 50 letters of
English text is $\mu = np = 50(0.072) = 3.6$; the theoretical value of the mean square of the number of
occurrences ($\mu_2$) is $\mu_2 = n^2 p^2 + npq = (50)^2 (0.072)^2 + 50(0.072)(0.928) = 12.96 + 3.34 = 16.30$; the
theoretical value of the variance ($\sigma^2$) is $\sigma^2 = npq = 50(0.072)(0.928) = 3.34$. (In general, we shall use
Greek letters for theoretical values and Roman letters for the corresponding observed values.)
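
The three formulas of paragraph 9c reduce to a few multiplications, sketched here with the figures of example 7. The function name `binomial_moments` is an assumption for illustration.

```python
def binomial_moments(n, p):
    """Return (mean, mean square, variance) of the binomial distribution (q + p)^n."""
    q = 1.0 - p
    mean = n * p                          # mu = np
    var = n * p * q                       # sigma^2 = npq
    mean_sq = n * n * p * p + n * p * q   # mu_2 = n^2 p^2 + npq
    return mean, mean_sq, var

# Example 7: A with probability 0.072 in a set of 50 letters.
mu, mu_2, sigma_sq = binomial_moments(50, 0.072)
print(round(mu, 2), round(mu_2, 2), round(sigma_sq, 2))
```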

Example 8. — It will be of interest to compare the theoretical values derived in example 7
with the observed values obtained from the observed occurrences of A in 100 sets of English text
of 50 letters each, already considered in example 4. In example 4 it was found that $\bar{x} = 3.74$. The
mean square of the number of occurrences is given by (see p. 5)

$$m_2 = \frac{3 \times 1^2 + 26 \times 2^2 + 21 \times 3^2 + 19 \times 4^2 + 15 \times 5^2 + 8 \times 6^2 + 7 \times 7^2 + 1 \times 8^2}{3 + 26 + 21 + 19 + 15 + 8 + 7 + 1} = \frac{1670}{100} = 16.70$$

To find the variance we use the fact that variance = (mean square) − (square of mean), or
$\sigma^2 = \mu_2 - \mu^2$. Thus $s^2 = 16.70 - (3.74)^2 = 16.70 - 13.99 = 2.71$.



5 Since $p + q = 1$, (9.1) could be written as $\sum_{x=0}^{n} P_x = 1$.
A comparison of theoretical and observed values yields

                              Theoretical    Observed

Mean (μ)                          3.60         3.74
Mean square (μ_2)                16.30        16.70
Variance (σ²)                     3.34         2.71
Standard deviation (σ)            1.83         1.65


d. It should be clear that the values of the observed means of a sequence of samples will
also be distributed in accordance with a certain law of distribution, not necessarily the same as
the law of distribution of the original observations. The distribution of means of samples of
N from a population⁶ distributed according to the terms of $(q + p)^n$ is given by the corresponding
terms of $(q + p)^{nN}$ plotted to 1/N times the unit of the original binomial, i. e., the probability
that the mean takes the value 0, 1/N, 2/N, 3/N, . . . , nN/N is given by the corresponding
term of the expansion of $(q + p)^{nN}$.

e. The mean of the distribution of means is given by np and the variance of the distribution
of means is given by $npq/N$. The latter equation shows us, then, that if $\sigma^2$ be the variance of
a number of observations, the variance of the mean of N such observations is $\sigma^2/N$. This
last result signifies that the sample means will show a smaller variation about the true (or
population) mean than will the original observations. More exactly we may say that the mean
of N observations is $\sqrt{N}$ times as reliable as any of the N original observations.
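
The statements of paragraphs 9d and 9e can be verified exactly, without sampling, by building the distribution of means from the terms of $(q + p)^{nN}$. The sketch below does so; the function names and the choice N = 4 are assumptions made only for illustration.

```python
from math import comb

def mean_distribution(n, p, N):
    """Distribution of the mean of N observations, each following (q + p)^n.

    The sum of the N observations follows (q + p)^(nN); dividing by N gives the mean.
    Returns a list of (value of the mean, probability) pairs.
    """
    q = 1.0 - p
    total = n * N
    return [(k / N, comb(total, k) * q ** (total - k) * p ** k)
            for k in range(total + 1)]

def moments(dist):
    """Mean and variance of a discrete distribution given as (value, probability) pairs."""
    mean = sum(value * prob for value, prob in dist)
    var = sum((value - mean) ** 2 * prob for value, prob in dist)
    return mean, var

n, p, N = 50, 0.072, 4          # N = 4 is an arbitrary illustrative sample count
mean_of_means, var_of_means = moments(mean_distribution(n, p, N))
print(round(mean_of_means, 4), round(var_of_means, 4), round(n * p * (1 - p) / N, 4))
```

The variance of the means comes out one-fourth of npq here, so the standard deviation is halved, in keeping with the $\sqrt{N}$ statement above.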

f. In order to apply the binomial distribution to numerical cases, it would be desirable that
there be available tables giving the values of the several terms of the expansion of $(q + p)^n$ for
various values of p and n. Unfortunately, such tables do not exist. However, since there are
tables for other distributions, which will provide sufficiently close approximations to the binomial
distribution for all our purposes, the lack of tables for the binomial distribution will not greatly
inconvenience us.

10. Normal distribution. — a. In the case of the binomial distribution, we saw that the
statistical variate took on only integral values. However, for the distribution now to be considered,
such is not the case. A statistical variate is said to be normally distributed when it
takes on all values between −∞ (minus infinity) and +∞ (plus infinity) with frequencies such
that the logarithm of the frequency at any distance X from the mean of the distribution is less
than the logarithm of the frequency at the mean of the distribution by a quantity proportional
to X². A more precise expression of the foregoing is the following: The statistical variate normally
distributed takes on all values between −∞ (minus infinity) and +∞ (plus infinity) in accordance with
the following law of probability: The probability that the statistical variate lies between $X - \epsilon/2$
and $X + \epsilon/2$, where $\epsilon$ is a very small number, is given by

$$p(X, \epsilon) = \frac{\epsilon}{\sigma \sqrt{2\pi}}\, e^{-(X - \mu)^2 / 2\sigma^2} \tag{10.1}$$

6 By population we here mean the idealized aggregate of data from which the sample is supposed to have
been drawn by chance.
In the preceding formula there are two parameters,⁷ μ and σ. It may be shown that μ and σ²
are the mean and variance respectively of a statistical variate with the normal law of distribution.
(Hence the importance of the mean and variance or standard deviation, since a knowledge
of them is all that is necessary completely to determine the normal law of probability.) In (10.1)
X − μ is the distance of the observation X from the mean μ, and σ measures in the same units
the extent to which the individual observations are scattered.

b. For purposes of tabulation, it is usual to treat $(X - \mu)/\sigma = x$ as the variate and to omit the
factor $\epsilon/\sigma$ in (10.1); thus, in part 2 will be found tables giving the values of y for various values
of x in accordance with the formula

$$y = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2} \tag{10.2}$$

The curve corresponding to the formula (10.2) is the familiar normal probability curve, given in
diagram 1 herewith. Geometrically, σ is the distance on either side of the mean (or center) of
the steepest points, or points of inflection of the curve.



Diagram 1. Normal probability curve.

c. In practice it is more often necessary to know the probability that a statistical variate
satisfying the normal law lies between two values, say $X_0$ and $X_1$, where $X_1 > X_0$. Tables have
been calculated to enable this to be done readily. If we set $x_1 = (X_1 - \mu)/\sigma$ and $x_0 = (X_0 - \mu)/\sigma$, the
desired result is given by⁸

$$P(x_0, x_1) = \frac{1}{\sqrt{2\pi}} \int_{x_0}^{x_1} e^{-x^2/2}\, dx \tag{10.3}$$

The tables that have been calculated (shown in part 2) are for the value $x_0 = -\infty$; that is, the
tables give the probability that x is less than or equal to $x_1$. In order to obtain the result desired,
use must then be made of the formula

$$P(x_0, x_1) = P(-\infty, x_1) - P(-\infty, x_0) \tag{10.4}$$

7 A parameter is a “variable constant” which enters into a mathematical formula. Thus in (10.1) μ and σ
are constant for a given population but take on different values for different populations.

8 The symbol $\int_{x_0}^{x_1}$ (read the integral from $x_0$ to $x_1$) may be traced back to the S of the word Sum. In essence
the integral is the limit of the sum of the values of the integrand (the expression to be integrated) as x takes on
values, between $x_0$ and $x_1$, which differ by smaller and smaller amounts. Thus the discussion in paragraph 10c
is conceptually similar to the discussion in paragraph 9c.
A graphic description of the above will help clarify the matter. Assuming the total area
under the curve to be unity, then the shaded area in diagram 2 is that which is desired in accord- 
ance with (10.3). 




The values that have been tabulated correspond to the shaded areas in diagrams 3a and 3 b. 





By subtracting the area shown in diagram 3b from that shown in diagram 3a, we get the
desired area of diagram 2.

d. For the normal distribution 68 percent of the observations lie within a range of ±σ about
the mean; 95 percent within a range of ±2σ about the mean; 99.7 percent within a range of
±3σ about the mean.
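
Where the tables of part 2 are not at hand, the tabulated probability P(−∞, x) can be computed from the error function, and formula (10.4) then gives the probability between any two limits. The following is a sketch; the names `normal_cdf` and `normal_between` are assumptions, while `math.erf` is the standard library's error function.

```python
from math import erf, sqrt

def normal_cdf(x):
    """P(-inf, x): probability that the standardized normal variate does not exceed x."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def normal_between(x0, x1):
    """Formula (10.4): P(x0, x1) = P(-inf, x1) - P(-inf, x0)."""
    return normal_cdf(x1) - normal_cdf(x0)

# Proportion of observations within 1, 2, and 3 standard deviations of the mean.
for k in (1, 2, 3):
    print(k, round(normal_between(-k, k), 4))
```

The printed proportions agree, after rounding, with the 68, 95, and 99.7 percent quoted in paragraph 10d.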

e. The means of sets of N observations distributed in accordance with the normal law of
probability are also distributed normally; their mean is the same as that of the original observations,
but with variance 1/N as large; i. e., if the mean and variance of the original distribution
are μ and σ² respectively, then the mean and variance of the distribution of means are μ and σ²/N
respectively. The remarks made in paragraph 9d apply here too.
f. If in the binomial distribution p and q do not differ greatly and if n is large, then that
distribution is given with a sufficient degree of approximation by a normal distribution with
mean equal to np and variance equal to npq; i. e., under the conditions set forth above

$$\frac{n(n-1)(n-2) \cdots (n-x+1)}{1 \times 2 \times 3 \times \cdots \times x}\, q^{n-x} p^x = \text{approx. } \frac{1}{\sqrt{2\pi npq}}\, e^{-(x - np)^2 / 2npq}$$

$$\sum_{x=0}^{r} \frac{n(n-1) \cdots (n-x+1)}{1 \times 2 \times \cdots \times x}\, q^{n-x} p^x = \text{approx. } \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{t} e^{-x^2/2}\, dx$$

where $t = (r - np)/\sqrt{npq}$.



g. To indicate the approximation of the binomial distribution by the normal distribution,
there are listed on page 18 corresponding values as calculated from the binomial distribution,
for n = 64, p = ½, and as given by the normal distribution.⁹ (In the normal distribution we use
μ = np = 32 and σ² = npq = 16.)

Example 9. — What is the probability that in a set of 100 letters of English text, the number
of vowels is between 35 and 45, inclusive? Taking as the probability for the occurrence of a vowel
p = 0.40, there is obtained from the binomial distribution $\mu = np = 40$ and $\sigma^2 = npq = 24$, so that
$x_0 = \frac{X_0 - \mu}{\sigma} = \frac{35 - 40}{4.899} = -1.02$ and $x_1 = \frac{X_1 - \mu}{\sigma} = \frac{45 - 40}{4.899} = 1.02$.
From the table of the normal distribution,
it is found that P(−∞, 1.02) = 0.8461 and P(−∞, −1.02) = 0.1539, so that P(−1.02, 1.02) =
0.8461 − 0.1539 = 0.6922. In other words, about 70 percent of sets of 100 letters each of English
text will have between 35 and 45 vowels, inclusive.
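
The steps of example 9 can be retraced by machine as a check on the table lookup. The sketch below rounds the limits to two places as the text does; the exact binomial sum is added only for comparison and is not part of the method of the example. All names are assumptions introduced for illustration.

```python
from math import comb, erf, sqrt

def normal_cdf(x):
    """P(-inf, x) for the standardized normal variate."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

n, p = 100, 0.40
mu = n * p                         # 40
sigma = sqrt(n * p * (1 - p))      # square root of 24, about 4.899

# Standardized limits, rounded to two places as in the text.
x0 = round((35 - mu) / sigma, 2)
x1 = round((45 - mu) / sigma, 2)
approx = normal_cdf(x1) - normal_cdf(x0)

# Exact binomial probability of 35 to 45 vowels inclusive, for comparison only.
exact = sum(comb(n, x) * (1 - p) ** (n - x) * p ** x for x in range(35, 46))

print(x0, x1, round(approx, 4), round(exact, 4))
```

The normal figure runs somewhat below the exact binomial sum because the limits 35 and 45 are used as they stand, without widening the interval by half a unit on each side.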

h. Using the method employed in example 9, limits were calculated within which the 
number of vowels (A, E, I, O, U, Y), high-frequency consonants (D, N, R, S, T), medium- 
frequency consonants (B, C, F, G, H, L, M, P, V, W), and low-frequency consonants (J, K, 
Q, X, Z) would be expected to lie for messages up to 200 letters in length. The results have 
been graphed and may be found in charts 1, 2, 3, and 4. (See pp. 14, 15, 16, and 17.) 

In chart 1, curve V_1 marks the lower limit of the number of vowels to be expected in a
message of given length; curve V_2 marks the upper limit. Thus, for example, in a message of
100 letters in plain English there should be between 33 and 47 vowels.

In chart 2, curves H_1 and H_2 mark the lower and upper limits as regards the high-frequency
consonants. In a message of 100 letters there should be between 28 and 42 high-frequency
consonants.

In chart 3, curves M_1 and M_2 mark the lower and upper limits as regards the medium-frequency
consonants. In a message of 100 letters there should be between 17 and 31 medium-frequency
consonants.

In chart 4, curves L_1 and L_2 mark the lower and upper limits as regards the low-frequency
consonants. In a message of 100 letters there should be between 0 and 3 low-frequency
consonants.

9 These values are taken from Yule, G. U., An Introduction to the Theory of Statistics, 9th Ed. Rev.
London, 1929, ch. XV.
Approximation of the Binomial Distribution by the Normal Distribution



11. Poisson distribution.¹⁰ — a. In both the binomial and normal distributions, it was seen
that there are two parameters that play important roles: n and p in the binomial distribution,
and μ and σ in the normal distribution. In the distribution now to be considered there enters but
one parameter.

b. The Poisson distribution, known also as the Law of Small Numbers, the Law of Small
Probabilities, and Poisson's Exponential Law, relates to a statistical variate which takes on
non-negative integral values only (0, 1, 2, . . . ). According to this distribution, the probability
that an event occurs zero, one, two, three, . . . , x, . . . times is given by the corresponding term
of the sequence

$$e^{-m},\quad m\,e^{-m},\quad \frac{m^2}{2!}\,e^{-m},\quad \frac{m^3}{3!}\,e^{-m},\quad \ldots,\quad \frac{m^x}{x!}\,e^{-m},\quad \ldots$$



18 See appendix B, p. 149 ff. 
where $x!$ is factorial $x$, i. e., $x(x-1)(x-2)(x-3) \cdots 1$. The parameter $m$ that enters into the 
distribution is the mean of the statistical variate. 

Chart No. 6. — Poisson Exponential 

c. The mean and variance of a statistical variate distributed in accordance with the Poisson 
distribution are equal, i. e., $m = \sigma^2$. This may serve as an indication, but not a conclusive one, as 
to when this distribution may be used. 

d. In paragraph 10f it was stated that the normal distribution will serve as an approximation 
to the binomial distribution if n is large and p and q nearly 0.5. If, however, p (or q) is small, 
and n large, the Poisson distribution will provide a good approximation to the binomial dis- 
tribution. 

e. This distribution will be very useful in cryptanalysis since most of the probabilities that 
the cryptanalyst will consider are small. To facilitate the use of the Poisson distribution, tables 
have been prepared for this distribution for values of m from 0.1 to 15 by tenths and for the pos- 
sible values of the statistical variate. These tables will be found in part 2. (See pp. 136-144). 

For convenience in certain problems some of the tables have been prepared in graphic form 
and will be found in charts 5, 6, and 7. On the horizontal axis is plotted the value of the mean 
and on the vertical axis is plotted the value of the probability. The curves drawn are for 0, 1, 
2, . . . , 11 occurrences. Thus in order to find the probability for three occurrences in a Poisson 
exponential with mean 6 one proceeds as follows: Find the value 6 on the horizontal or m axis; 
follow this value vertically until the curve $f_3$ is met; then proceed horizontally to the left where 
the value P=0.09 is found. 
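The chart reading above can be checked numerically. The following is a minimal sketch, assuming only the Poisson term $e^{-m} m^x / x!$ stated in paragraph 11b; the function name `poisson_pmf` is an illustrative choice and does not come from the text.

```python
import math


def poisson_pmf(x: int, m: float) -> float:
    """Probability of exactly x occurrences when the mean is m."""
    return math.exp(-m) * m**x / math.factorial(x)


# Three occurrences in a Poisson exponential with mean 6,
# the case read from the chart in the text.
p3 = poisson_pmf(3, 6.0)
print(f"P(x = 3 | m = 6) = {p3:.4f}")
```

Rounded to two places, the computed value agrees with the P=0.09 read from the chart.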

f. To indicate the approximation of the binomial distribution by the Poisson distribution, 
there are listed below values as calculated from the binomial distribution for n=50, p=0.01 
and the corresponding values given by the Poisson distribution for m=np=0.5. 
Example 10. — A study of 100 sets of 50 letters each of English text yielded the following 
observed distribution for the number of B’s per set of 50 letters: 
(i. e., there were no B’s in 66 of the sets, one B in each of 29 of the sets, and 2 B’s in each of 5 
of the sets). Compare this with the theoretical distribution to be expected according to the 
binomial distribution and the Poisson distribution, if p=0.01 is taken as the probability for the 
occurrence of B. Since 100 sets were observed, it is merely necessary to multiply the proba- 
bilities derived above for the binomial and the corresponding Poisson distribution by 100, in 
order to get the theoretical number of occurrences (or theoretical absolute frequencies). There 
is thus obtained : 
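The theoretical frequencies of example 10 can be recomputed directly. The sketch below assumes the values stated in the text (n=50, p=0.01, m=np=0.5, 100 sets, and the observed counts 66, 29, and 5); the function and variable names are illustrative only.

```python
import math


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Binomial probability of exactly x occurrences in n trials."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_pmf(x: int, m: float) -> float:
    """Poisson probability of exactly x occurrences with mean m."""
    return math.exp(-m) * m**x / math.factorial(x)


n, p, sets = 50, 0.01, 100
observed = {0: 66, 1: 29, 2: 5}  # number of B's per set, from example 10

# Each row: (x, observed frequency, binomial frequency, Poisson frequency).
table = []
for x in range(4):
    table.append(
        (x, observed.get(x, 0), sets * binomial_pmf(x, n, p), sets * poisson_pmf(x, n * p))
    )

print(" x  observed  binomial  Poisson")
for x, obs, b, q in table:
    print(f"{x:>2}  {obs:>8}  {b:8.2f}  {q:7.2f}")
```

The two theoretical columns differ only in the second decimal place, which is the point of the approximation for small p.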
12. Modified Poisson distribution. — a. It may be shown that under certain conditions any 
discontinuous frequency distribution, for which the variate takes on integral values, may be 
expressed as the sum of an infinite series of terms consisting of the Poisson exponential and its 
finite differences. That is to say, if $F(x)$ $(x = 0, 1, 2, \ldots)$ represents a discontinuous frequency 
distribution, then 

$$F(x) = P(x,m) + c_2\,\Delta^2 P(x,m) + c_3\,\Delta^3 P(x,m) + \cdots$$

where 

$$P(x,m) = \frac{e^{-m} m^x}{x!} \qquad (x = 0, 1, 2, \ldots)$$

$$\Delta P(x,m) = P(x,m) - P(x-1,m)$$

$$\Delta^2 P(x,m) = \Delta P(x,m) - \Delta P(x-1,m)$$

etc. 

and $m$ and $c_2, c_3, \ldots$ are determined by $F(x)$. The foregoing series is known as the 
Poisson-Charlier frequency series or Charlier's type B frequency curves. 

b. It has been seen thus far that the application of the binomial distribution is greatly 
aided by the fact that for values of p and q nearly 0.5 and n large, the normal distribution 
offers a suitable approximation, and that for p (or q) very small and n large, the Poisson dis- 
tribution offers a good approximation. In order to find a suitable approximation to the binomial 
for intermediate values of p it is necessary to modify the Poisson distribution slightly. A satis- 
factory modification for this purpose is obtained by taking the first two terms of the series 
described in the preceding subparagraph. 

c. According to this modified Poisson distribution, a good approximation for the probability 
that a statistical variate take the non-negative integral value x under the conditions discussed in 
paragraph 12b is given by 

(12.1) $\dfrac{n!}{x!\,(n-x)!}\,p^x q^{n-x} \approx \dfrac{e^{-m} m^x}{x!} - \dfrac{np^2}{2}\,\Delta^2 \dfrac{e^{-m} m^x}{x!}$

where 

$$\Delta \frac{e^{-m} m^x}{x!} = \frac{e^{-m} m^x}{x!} - \frac{e^{-m} m^{x-1}}{(x-1)!}$$

and 

$$\Delta^2 \frac{e^{-m} m^x}{x!} = \Delta \frac{e^{-m} m^x}{x!} - \Delta \frac{e^{-m} m^{x-1}}{(x-1)!}$$

The values of $\Delta\, e^{-m} m^x/x!$ and $\Delta^2 e^{-m} m^x/x!$ are easily obtained from the tables of the Poisson 
distribution by subtracting consecutive values. 

d. To illustrate (12.1) consider the case for n=100 and p=0.1, so that m=np=10, and 
$np^2/2$=0.5. 

In the following, the values in column 2 are taken directly from the tables of the Poisson 
distribution for m=10. The values in column 3 are obtained by subtracting from the corresponding 
value in column 2 the one just above it. The values in column 4 are obtained by subtracting 
from the corresponding value in column 3 the one just above it. The values in column 5 are 
obtained by multiplying the corresponding values of column 4 by $np^2/2$=0.5. Finally, column 
6 gives the difference between the corresponding values of columns 2 and 5. 
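The column-by-column procedure just described can be sketched in code. This is a minimal illustration under the stated case n=100, p=0.1; the helper names are hypothetical, and negative results are set to zero in line with the footnote to the table.

```python
import math


def poisson_pmf(x: int, m: float) -> float:
    """Column 2: Poisson term; zero for negative x, as the footnote states."""
    if x < 0:
        return 0.0
    return math.exp(-m) * m**x / math.factorial(x)


def second_difference(x: int, m: float) -> float:
    """Column 4: the difference of consecutive first differences."""
    return poisson_pmf(x, m) - 2 * poisson_pmf(x - 1, m) + poisson_pmf(x - 2, m)


def modified_poisson(x: int, n: int, p: float) -> float:
    """Column 6: column 2 minus (n p^2 / 2) times column 4, floored at zero."""
    m = n * p
    value = poisson_pmf(x, m) - (n * p * p / 2) * second_difference(x, m)
    return max(value, 0.0)


def binomial_pmf(x: int, n: int, p: float) -> float:
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 100, 0.1
for x in (5, 10, 15):
    print(
        f"x={x:>2}  binomial={binomial_pmf(x, n, p):.6f}  "
        f"Poisson={poisson_pmf(x, n * p):.6f}  "
        f"modified={modified_poisson(x, n, p):.6f}"
    )
```

Near the mean the modified value lies noticeably closer to the binomial value than the plain Poisson term does.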




1 The value of $e^{-m} m^x/x!$ for x a negative integer is zero. 

* Even though these values come out negative they must be considered as 0.000000 since a negative proba- 
bility has no meaning. 
e. Let us now compare the corresponding values as given by the binomial distribution with 
n=100, p=0.1, the related Poisson distribution, and the modified Poisson distribution as just 
derived. The values for the binomial are taken from A. Fisher, Mathematical Theory of 
Probabilities, p. 268. 
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Figure 2. 



Example 11 . — A study of 100 sets of 100 letters each of English plain text yielded the 
following as the distribution of the occurrences of T. 





(i. e., there were 2 T’s in 1 set of 100 letters; 3 T’s in each of 2 sets of 100 letters each; 4 T’s in 
each of 2 sets of 100 letters each, etc.). Compare the above distribution with the theoretical 
distribution to be expected according to the binomial, Poisson, and modified Poisson distributions, 
taking as the probability for the occurrence of T, p=0.1. Since 100 sets were observed it is 
necessary to multiply the probabilities derived in figure 2 by 100 to get the theoretical absolute 
frequencies. There thus results 
13. Multinomial distribution. 11 — a. The multinomial distribution is an extension of the 
binomial distribution. In the binomial distribution the possible event considered was one of 
two mutually exclusive categories: the event either did or did not occur. In the multinomial 
distribution the possible event may be one of $r$ mutually exclusive categories with the respective 
probabilities of occurrence $p_1, p_2, \ldots, p_r$, where $p_1 + p_2 + \cdots + p_r = 1$. 

b. If an event may occur in one of $r$ mutually exclusive ways with the corresponding 
probabilities $p_1, p_2, \ldots, p_r$, where $p_1 + p_2 + \cdots + p_r = 1$, then in $n$ observations the probability 
that the event has occurred exactly $x_1$ times the first way, exactly $x_2$ times the second way, 
. . . , exactly $x_r$ times the $r$th way, where $x_1 + x_2 + \cdots + x_r = n$, is given by 

$$P(x_1, x_2, \ldots, x_r) = \frac{n!}{x_1!\,x_2! \cdots x_r!}\; p_1^{x_1} p_2^{x_2} \cdots p_r^{x_r}$$

The sum of the foregoing expression for all non-negative integral values of $x_1, x_2, \ldots, x_r$ such that 
$x_1 + x_2 + \cdots + x_r = n$ is $(p_1 + p_2 + \cdots + p_r)^n$. 

The binomial distribution is thus seen to be the preceding for $r = 2$ with $p_1 = p$ and $p_2 = q$. 
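The multinomial term of paragraph 13b can be evaluated as follows. This is a small sketch with illustrative names; the category probabilities used in the check (0.5, 0.3, 0.2) are arbitrary assumptions and do not come from the text.

```python
import math
from typing import Sequence


def multinomial_pmf(counts: Sequence[int], probs: Sequence[float]) -> float:
    """Probability of observing counts[i] events of category i in sum(counts) trials."""
    n = sum(counts)
    coefficient = math.factorial(n)
    for x in counts:
        coefficient //= math.factorial(x)  # stays an integer at every step
    value = float(coefficient)
    for x, p in zip(counts, probs):
        value *= p**x
    return value


# With r = 2 the expression reduces to the binomial term.
two_way = multinomial_pmf((3, 7), (0.3, 0.7))
binomial = math.comb(10, 3) * 0.3**3 * 0.7**7

# Summed over every admissible set of counts, the terms total (p1 + p2 + p3)^n = 1.
total = sum(
    multinomial_pmf((a, b, 4 - a - b), (0.5, 0.3, 0.2))
    for a in range(5)
    for b in range(5 - a)
)
print(two_way, binomial, total)
```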

c. If the possible event be the selection of a letter from English telegraphic text then r=26 
and the values of $p_1, p_2, \ldots, p_{26}$ are those listed in figure 1. The multinomial distribution 
will thus give the probability that in a selection of n letters of English telegraphic text there 
are exactly $x_1$ A’s, $x_2$ B’s, . . . , $x_{26}$ Z’s where $x_1 + x_2 + \cdots + x_{26} = n$. 

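d. It may be shown that for the multinomial distribution 

$$E(x_i) = n p_i \qquad \text{(see footnote 12)}$$

$$E(x_i^2) = n^2 p_i^2 + n p_i (1 - p_i)$$

$$E(x_i x_j) = n(n-1)\,p_i p_j = E(x_i)E(x_j) - n p_i p_j \qquad (i \ne j;\ i, j = 1, 2, \ldots, r)$$

$$\phantom{E(x_i x_j)} = E(x_i)E(x_j) - \frac{E(x_i)E(x_j)}{n} \qquad \text{(see footnote 13)}$$
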

11 See appendix C, p. 160. 

12 E( ) means the expected or average value of the expression in the parenthesis. 

13 For events which are independent in the sense of probability, $E(x_i x_j) = E(x_i)E(x_j)$. 
The φ test for non-random character of text 18 



14. Repetitions. — The importance of the role played by repetitions in the analysis of 
cryptograms is well understood, even by the amateur cryptanalyst. Repetitions in cryptographic 
text are basically of two sorts — causal and accidental. Causal repetitions are those which repre- 
sent the encipherment of plain-text repetitions which have undergone the same cryptographic 
treatment. Accidental repetitions are those, which, through fortuitous circumstances, are the 
encipherments of different plain-text elements. In the case of most cryptograms of the substi- 
tution class, the finding of repetitions of sequences of fair length, say four, five, or more char- 
acters, usually leads to solution; because as the lengths of repetitions increase it becomes more 
certain that such repetitions are causal and not accidental in nature. However, it often happens 
in the case of the more complex types of cryptograms that repetitions are rather scarce and such 
as are found are short. In such cases it becomes very important to be able to judge whether the 
repetitions which are present are causal or are accidental. In the following we shall consider 
certain procedures and tests which will be of service in the evaluation of the cryptographic sig- 
nificance of repeated cipher elements. 

15. Expected number of blanks in random text. — a. By random text is meant text in which 
the interplay of those factors which give rise to a particular cipher element is such that the 
cipher elements will occur with approximately the same probability, e. g., the cipher text pro- 
duced by a polyalphabetic substitution of say 10 different alphabets would be random text 
insofar as the individual letters of the cryptogram were concerned. The uniliteral frequency 
distribution of such text would be “flat,” i. e., there would be no pronounced crests and troughs. 

b. Suppose there is at hand a selection of random text of N elements of a system in which 
there are n different elements possible, e. g., the text may consist of N=50 letters of an n=26 
letter alphabet; or we may consider a text of N=376 digraphs where there are n=676 different 
possible digraphs, etc. Then the probability for the occurrence of a particular element is 1/n. 
Not all of the n possible elements will necessarily occur in the text of N elements, and the number 
which does not appear is sometimes of significance. To take advantage of that number it would 
be necessary to know the theoretical distribution of the number of blanks, i. e., of the number 
of elements which do not appear. This distribution has been found to be 

(15.1) $P_0(r) = \dfrac{n!}{r!\,(n-r)!} \displaystyle\sum_{k=0}^{n-r} (-1)^k \dfrac{(n-r)!}{k!\,(n-r-k)!} \left(1 - \dfrac{r+k}{n}\right)^N$

where $P_0(r)$ represents the probability that there are exactly r blanks. 

c. The values of (15.1) for n=N=10 are as follows: 



 r      P0(r)             r      P0(r)

 0    0.000362880         6    0.017188920
 1     .016329600         7     .000671760
 2     .136080000         8     .000004599
 3     .355622400         9     .000000001
 4     .345144240
 5     .128595600              1.000000000

A study of 200 sets of 10 random digits each, yielded the following as the distribution of the 
number of blanks per set of 10 digits. 



Number of     Theoretical     Observed
blanks        frequency       frequency
   r          200 P0(r)           f            rf

   0             0.08             0             0
   1             3.26             8             8
   2            27.22            22            44
   3            71.12            72           216
   4            69.02            72           288
   5            25.72            21           105
   6             3.44             4            24
   7              .14             1             7
   8             0.00             0             0
   9             0.00             0             0

               200.00           200           692



From the foregoing it is seen that the observed average number of blanks per set of 10 digits 
is 692/200=3.46. 
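The tabulated probabilities for n=N=10 can be reproduced with exact rational arithmetic. The sketch below uses the classical inclusion-exclusion form of the occupancy distribution; treating it as equivalent to (15.1) is an assumption, supported by its agreement with the tabulated values. The function name `blank_distribution` is illustrative.

```python
from fractions import Fraction
from math import comb


def blank_distribution(n: int, N: int) -> list[Fraction]:
    """Exact probabilities of exactly r blanks (r = 0..n) among n equally
    likely elements after N independent observations."""
    dist = []
    for r in range(n + 1):
        total = Fraction(0)
        # Inclusion-exclusion over the k further elements forced to be absent.
        for k in range(n - r + 1):
            total += (-1) ** k * comb(n - r, k) * Fraction(n - r - k, n) ** N
        dist.append(comb(n, r) * total)
    return dist


dist = blank_distribution(10, 10)
mean_blanks = sum(r * p for r, p in enumerate(dist))

for r, p in enumerate(dist):
    print(f"{r:>2}  {float(p):.9f}  {200 * float(p):7.2f}")
print("theoretical mean number of blanks:", float(mean_blanks))
```

The theoretical mean printed at the end can be compared with the observed average of 3.46 blanks per set.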

d. The average (or expected) number of blanks in a frequency distribution of random text 
of N elements of a system in which there are n different elements possible is given by 14 

(15.2) $B_N = n(1 - 1/n)^N$

For large values of n a good enough approximation is given by 

(15.3) $B_N \approx n e^{-N/n}$

where e=2.7183 is the base of natural logarithms. For particular values of N and n, the value 

14 The value in (15.2) may be derived from the distribution given by (15.1) in accordance with the definition 
of the mean. However, the following simple considerations will lead to the same result. The probability that 
a particular element does not appear in a single observation is $(1 - 1/n)$. In N observations, the probability that a particular element 
has not occurred is $(1 - 1/n)^N$. Since there are n different possible elements, the expected number of blanks 
is as in (15.2). 
of $B_N$ may be found from tables 15 of $e^{-x}$. For n=26, i. e., for monographic distributions, a 
chart has been prepared whereby the value of $B_N$ may be readily found for values of N from 
0 to 200. This chart, No. 8, will be found on page 30. 

Example 12. — How many blanks are to be expected in the digraphic distribution of a random 
text of 100 digraphs? In this case N=100 and n=676. Thus $B_{100} = 676\,e^{-100/676}$; 100/676=0.148; 
$e^{-0.148}$=0.862; 676×0.862=583, or there are to be expected 583 blanks or 676−583=93 different 
digraphs. 
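Formulas (15.2) and (15.3) are easily evaluated for the case of example 12. The following is a minimal sketch; the function names are illustrative and not taken from the text.

```python
import math


def expected_blanks_exact(N: int, n: int) -> float:
    """(15.2): n different elements, N observations, all equally likely."""
    return n * (1 - 1 / n) ** N


def expected_blanks_approx(N: int, n: int) -> float:
    """(15.3): exponential approximation, adequate for large n."""
    return n * math.exp(-N / n)


exact = expected_blanks_exact(100, 676)
approx = expected_blanks_approx(100, 676)
print(f"exact (15.2):  {exact:.2f} blanks, {676 - exact:.2f} different digraphs")
print(f"approx (15.3): {approx:.2f} blanks, {676 - approx:.2f} different digraphs")
```

For n as large as 676 the two formulas agree to within a small fraction of one blank, so the choice between them is a matter of convenience.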

e. For large values of n (say n ≥ 26) it may be shown that the value in (15.1) is to a sufficient 
approximation given by 

(15.4) $P_0(r) = \dfrac{n!}{r!\,(n-r)!}\; e^{-rN/n} \left(1 - e^{-N/n}\right)^{n-r}$

In other words, the distribution of the number of blanks in random text of N elements of a system 
in which there are n elements possible is given by the binomial distribution with $p = e^{-N/n}$ and 
n=n, so that $\mu = n e^{-N/n}$ and $\sigma^2 = n e^{-N/n}(1 - e^{-N/n})$. 

16. Expected number of blanks in non-random text. — a. By non-random text is meant text 
in which the elements have been properly allocated in accordance with their cryptographic treat- 
ment. Thus, the text of a cryptogram enciphered polyalphabetically with 10 alphabets, although 
random text in so far as the individual letters are concerned when considered as a whole, is non- 
random text when each letter is allocated to the proper alphabet. The text of a Playfair Cipher, 
for example, is non-random text when divided up into digraphs. Monoalphabetic text is an 
example of non-random text, closely akin to plain-text. 

b. Suppose that the n possible elements of non-random text have different probabilities 
of occurrence, e. g., for monoalphabetic systems in English, the different probabilities of the 
various letters are those given in figure 1; for digraphic systems the different probabilities of the 
various digraphs are those given in section VIII. Let these n probabilities be $p_1, p_2, \ldots, p_n$. 
In the following discussion the values of the probabilities only are of importance and not the 
correspondence between certain plain-text elements and certain probabilities. In other words, from a 
statistical viewpoint plain-text and non-random text are the same. If a text of N elements is 
considered, then not all of the n possible elements will necessarily appear. The theoretical distribution 
of the number of blanks is known for this case also. 

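c. If $P_0(r)$ represents the probability that there are exactly r blanks, then it may be shown 
that 

(16.1) 

$$P_0(0) = 1 - \sum_{i} (1 - p_i)^N + \frac{1}{2!} \sum_{i,j} (1 - p_i - p_j)^N - \frac{1}{3!} \sum_{i,j,k} (1 - p_i - p_j - p_k)^N + \cdots$$

$$P_0(1) = \sum_{i} (1 - p_i)^N - \sum_{i,j} (1 - p_i - p_j)^N + \frac{1}{2!} \sum_{i,j,k} (1 - p_i - p_j - p_k)^N - \cdots$$

$$P_0(2) = \frac{1}{2!} \left\{ \sum_{i,j} (1 - p_i - p_j)^N - \sum_{i,j,k} (1 - p_i - p_j - p_k)^N + \cdots \right\}$$

etc., where each index runs from 1 to n and the indices in any one sum are all different. 

No special cases of (16.1) have been evaluated. If in (16.1) $p_1 = p_2 = \cdots = p_n = 1/n$, then (16.1) 
reduces to (15.1) as is to be expected. 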

15 Smithsonian Physical Tables, 7th Ed. Rev., pp. 48-53. The $f_0$ curve of Chart No. 5 may also be 
employed, since it is in reality the graph of $e^{-m}$. 
d. If the n possible elements of a system have the probabilities of occurrence $p_1, p_2, \ldots, p_n$ 
respectively, then the average number of blanks in a text of N elements is given by 16 

(16.2) $B_N = (1 - p_1)^N + (1 - p_2)^N + \cdots + (1 - p_n)^N$

A good approximation to the formula in (16.2) is given by 

(16.3) $B_N \approx e^{-Np_1} + e^{-Np_2} + \cdots + e^{-Np_n}$

e. Using the values of $p_i$ (i=1, 2, . . . , 26) for English text given in figure 3, (16.3) yields 
for the number of blanks in monoalphabetic (or plain) text, for values of N from 10 to 200, the 
results shown in figure 4. 



p1 = 0.07189      p10 = 0.00198      p19 = 0.05754
p2 =  .01146      p11 =  .00353      p20 =  .09042
p3 =  .03345      p12 =  .03549      p21 =  .02993
p4 =  .04290      p13 =  .02534      p22 =  .01340
p5 =  .12604      p14 =  .07558      p23 =  .01401
p6 =  .02994      p15 =  .07408      p24 =  .00469
p7 =  .01795      p16 =  .02661      p25 =  .02099
p8 =  .03287      p17 =  .00318      p26 =  .00101
p9 =  .07592      p18 =  .08256

Figure 3. 







          Average number of blanks               Average number of blanks
   N      Theoretical     Observed        N      Theoretical

   10       18.40          18.50         110        5.64
   20       14.27          14.13         120        5.46
   30       11.71          11.55         130        5.21
   40       10.06          10.03         140        5.04
   50        8.86           8.84         150        4.88
   60        7.95           7.98         160        4.78
   70        7.28           7.33         170        4.67
   80        6.75           6.74         180        4.56
   90        6.28           6.29         190        4.44
  100        5.98           5.83         200        4.40


Figure 4. 



The observed values were obtained as the averages of 100 sets of text of 10, 20, ... , 100 
letters each. In view of the excellent correspondence between the observed and theoretical 
values, it was deemed unnecessary to continue this check for the cases N=110 to 200. The 
actual distributions of the observed number of blanks are given in figure 5. 
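The expected number of blanks for English text can be recomputed from the probabilities of figure 3. The following sketch evaluates both (16.2) and (16.3); the list `P_ENGLISH` is transcribed from figure 3, the function names are illustrative, and no claim is made that every entry of figure 4 is reproduced to the last decimal.

```python
import math

# Probabilities p1 ... p26 as listed in figure 3.
P_ENGLISH = [
    0.07189, 0.01146, 0.03345, 0.04290, 0.12604, 0.02994, 0.01795,
    0.03287, 0.07592, 0.00198, 0.00353, 0.03549, 0.02534, 0.07558,
    0.07408, 0.02661, 0.00318, 0.08256, 0.05754, 0.09042, 0.02993,
    0.01340, 0.01401, 0.00469, 0.02099, 0.00101,
]


def blanks_exact(N: int, probs=P_ENGLISH) -> float:
    """(16.2): sum over all elements of the chance that the element never occurs."""
    return sum((1 - p) ** N for p in probs)


def blanks_approx(N: int, probs=P_ENGLISH) -> float:
    """(16.3): the same sum with each term replaced by exp(-N p)."""
    return sum(math.exp(-N * p) for p in probs)


print("   N   (16.2)   (16.3)")
for N in range(10, 201, 10):
    print(f"{N:>4}  {blanks_exact(N):7.2f}  {blanks_approx(N):7.2f}")
```

Since $e^{-Np} \ge (1-p)^N$ for every p, the approximation (16.3) never falls below (16.2); the gap is largest for short messages.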

16 The value in (16.2) may be derived from the distribution given by (16.1) in accordance with the definition 
of the mean. However, the following simple considerations will lead to the same result. The probability that the 
i th (i=1, 2, . . . , n) element does not appear in a single observation is $(1 - p_i)$. The probability that the i th element does not occur 
in N observations is $(1 - p_i)^N$. The expected number of blanks is thus as given in (16.2). 




f. The graphs for the number of blanks given by (15.3) and (16.3) for monoalphabetic 
distributions in English have been plotted on one chart, chart number 8. Thus, given a text 
of N letters, one can estimate whether or not the text has been enciphered monoalphabetically, 
by comparing the observed number of blanks with the expected number of blanks in a text of 
N letters for both random and monoalphabetic text. The chart will be found on page 30 and 
also on page 163. A more accurate test as to whether or not the text is random would be 
to see whether the observed number of blanks could reasonably arise from the distribution 
given by (15.4). 

g. The corresponding results for French, German, Italian, Portuguese, and Spanish are 
given below in figure 6; the values of $p_i$ used are given in section VIII. Charts have been 
prepared so that the average number of blanks may be readily found for values of N from 10 to 
200. These charts, charts Nos. 9, 10, 11, 12, and 13, will be found on pages 31-35 and also on 
pages 164-168. 
Theoretical average number of blanks 

   N     French        German     Italian       Spanish     Portuguese
         (25 letter               (21 letter                (24 letter
         alphabet)                alphabet)                 alphabet)

   10     17.87         ...         ...          16.72        16.58
   20     13.99         ...         ...          12.75        12.81
   30     11.59         ...         7.53         10.42        10.39
   40      9.99        10.01        6.04          8.78         8.84
   50      8.85         8.77        4.98          7.59         7.73
   60      7.99         7.74        4.18          6.69         6.90
   70      7.31         7.18        3.57          5.98         6.24
   80      7.01         6.63        3.07          5.38         5.70
   90      6.25         6.20        2.66          4.89         5.26
  100      5.93         5.80        2.33          4.41         4.90
  ...      4.65         4.69        ...           3.02         3.64
  ...      3.97         4.35        ...           1.22         2.99


Figure 6. 



17. Expected number of elements occurring r times each. — a. Results similar to those 
derived for the number of blanks are obtainable for the number of elements each of which occurs 
once, twice, three times, etc. Although the exact theoretical distributions have been found for 
each case, they will not be given here. 

b. For random text of N elements, where there are n different possible elements, the average 
number of elements occurring once each is given by 

(17.1) $N(1 - 1/n)^{N-1}$

the average number of elements occurring twice each is given by 17 

17 If $N > n$, it is certain that some elements will occur more than once. If $N \le n$ it is possible that no element 
may occur more than once. Let us accordingly consider the problem, “In random text of N elements, where 
there are n elements possible and $N \le n$, what is the probability that at least one element occurs more than once?” 
The various possible forms that the distribution of the N elements may assume are given by the terms of the 
expansion of the multinomial $(p_1 + p_2 + \cdots + p_n)^N$ where $p_1 = p_2 = \cdots = p_n = 1/n$. The required probability 
is the sum of all those terms which contain at least one exponent greater than one (or the required probability 
is one minus the sum of all those terms having every exponent equal to one). Since $N \le n$ the number of terms 
in which every exponent is one is $n!/[N!\,(n-N)!]$, or the combination of n things taken N at a time. In 
accordance with the multinomial distribution, a sample of one of these terms is 

$$\frac{N!}{1!\,1! \cdots 1!}\; p_1 p_2 \cdots p_N$$

Since $p_1 = p_2 = \cdots = p_n = 1/n$ we have that the sum of all those terms with each exponent equal to one is given by 

$$\frac{n!}{N!\,(n-N)!} \cdot \frac{N!}{n^N} = \frac{n!}{(n-N)!\,n^N}$$

Accordingly the probability that at least one element occurs more than once is given by $1 - n!/[(n-N)!\,n^N]$. For 
large values of n a good approximation to $n!/[(n-N)!\,n^N]$ is given by $e^{-N(N-1)/2n}$, or the required probability is given 
by $1 - e^{-N(N-1)/2n}$. As an example consider a random text of 100 letters. What is the probability that a digraphic 
distribution of the text will show at least one digraph occurring twice? Since there are 99 digraphs in the 
100 letters, N=99, n=676. Thus, the required probability is $1 - e^{-99 \times 98/(2 \times 676)}$. 99×98=9702; 2×676=1352; 
9702/1352=7.2; $e^{-7.2}$=0.0007; $1 - e^{-7.2}$=0.9993. It is practically certain that at least one digraph will occur 
more than once. For trigraphs the values are N=98, n=17,576. Thus, 98×97=9506; 2×17,576=35,152; 
(17.2) $N(N-1)\,n\,(1 - 1/n)^{N-2} / (n^2 \cdot 2!)$

the average number of elements occurring r times each is given by 

(17.3) $N(N-1) \cdots (N-r+1)\,n\,(1 - 1/n)^{N-r} / (n^r \cdot r!)$

For large values of n, (17.1), (17.2), and (17.3) may respectively be approximated by 

(17.4) $n\,(N/n)\,e^{-N/n}$

(17.5) $n\,(N/n)^2\,(1/2!)\,e^{-N/n}$

(17.6) $n\,(N/n)^r\,(1/r!)\,e^{-N/n}$

The numerical values of (17.4), (17.5), and (17.6) for special cases may be easily found by 
means of the tables for the Poisson Exponential distribution, wherein are given the values of 
$(1/r!)(N/n)^r e^{-N/n}$ for values of r from 0 to 37 and for values of $m = N/n$ by tenths from 0.1 to 
15, or from Charts 5, 6, and 7. 



Chart No. 8. — Expected Number of Blanks English Plain Text (P) and Random Text (R) 




17 (Continued). 

9506/35,152=0.27; $e^{-0.27}$=0.7634; $1 - e^{-0.27}$=0.2366. In other words, about 24 out of 100 such selections will 
show at least one trigraph occurring more than once. For tetragraphs, N=97, n=456,976 so that 97×96=9312; 
2×456,976=913,952; 9312/913,952=0.01; $e^{-0.01}$=0.99; $1 - e^{-0.01}$=0.01. In other words about 1 out of 100 such 
selections would show at least one tetragraph occurring more than once. For pentagraphs N=96, n=11,881,376 
so that 96×95=9120; 2×11,881,376=23,762,752; 9120/23,762,752=0.0004; $e^{-0.0004}$=0.9996; $1 - e^{-0.0004}$=0.0004. 
In other words it is almost certain that such a selection of text would not show a single pentagraph (or for that 
matter, a polygraph of more than five letters), occurring more than once. 
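The arithmetic of footnote 17 may be verified mechanically. The sketch below computes both the exact probability $1 - n!/[(n-N)!\,n^N]$ and the exponential approximation for a random text of 100 letters; the function names are illustrative, and a 26-letter alphabet is assumed as in the footnote.

```python
import math


def prob_repeat_exact(N: int, n: int) -> float:
    """1 - n!/((n-N)! n^N): chance that some element occurs more than once."""
    p_all_different = 1.0
    for i in range(N):
        p_all_different *= (n - i) / n
    return 1.0 - p_all_different


def prob_repeat_approx(N: int, n: int) -> float:
    """1 - exp(-N(N-1)/2n): approximation for large n."""
    return 1.0 - math.exp(-N * (N - 1) / (2 * n))


letters = 100
for size, name in ((2, "digraphs"), (3, "trigraphs"), (4, "tetragraphs"), (5, "pentagraphs")):
    N = letters - size + 1   # number of polygraphs of this size in the text
    n = 26 ** size           # number of different polygraphs possible
    print(f"{name:<12} exact={prob_repeat_exact(N, n):.4f}  approx={prob_repeat_approx(N, n):.4f}")
```

As in the footnote, successive polygraphs of the text are treated as if they were independent observations, which is itself an approximation.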
Chart No. 9. — Expected Number of Blanks French Plain Text (25 letter alphabet; horizontal axis: number of letters per message) 

Chart No. 10. — Expected Number of Blanks German Plain Text 

Chart No. 11. — Expected Number of Blanks Italian Plain Text (21 letter alphabet) 

Example 13. — Given a random text of 104 letters, find the expected number of letters each 
of which occurs no times, once, twice, etc. 

In this case N=104, n=26, so that N/n=4. The desired values are given below in the 
last column; the values in the middle column were obtained from the tables of the Poisson 
exponential distribution for m=4. 



  r      (1/r!) 4^r e^-4      26 (1/r!) 4^r e^-4

  0        0.018316           0.476216 ≈ 0
  1         .073263           1.904838 ≈ 2
  2         .146525           3.809650 ≈ 4
  3         .195367           5.079542 ≈ 5
  4         .195367           5.079542 ≈ 5
  5         .156293           4.063618 ≈ 4
  6         .104196           2.709096 ≈ 3
  7         .059540           1.548040 ≈ 2
  8         .029770            .774020 ≈ 1
  9         .013231            .344006 ≈ 0
 10         .005292            .137592 ≈ 0
 11         .001925            .050050 ≈ 0
 12         .000642            .016692 ≈ 0
 13         .000197            .005122 ≈ 0
 14         .000056            .001456 ≈ 0


In other words, the average random text of 104 letters would show all letters occurring; two 
occurring once each; four occurring twice each; five occurring three times each; five occurring 
four times each; four occurring five times each; three occurring six times each; two occurring 
seven times and one occurring eight times. 
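The last column of example 13 can be regenerated from (17.6), and compared with the exact expression (17.3). This is a minimal sketch; the function names are illustrative.

```python
import math


def expected_exact(r: int, N: int, n: int) -> float:
    """(17.3): average number of elements occurring exactly r times each."""
    return n * math.comb(N, r) * (1 / n) ** r * (1 - 1 / n) ** (N - r)


def expected_poisson(r: int, N: int, n: int) -> float:
    """(17.6): the same quantity by the Poisson approximation, m = N/n."""
    m = N / n
    return n * m**r * math.exp(-m) / math.factorial(r)


N, n = 104, 26
rounded = [round(expected_poisson(r, N, n)) for r in range(15)]

print(" r   (17.3)   (17.6)  rounded")
for r in range(15):
    print(f"{r:>2}  {expected_exact(r, N, n):7.3f}  {expected_poisson(r, N, n):7.3f}  {rounded[r]:>4}")
```

The rounded Poisson values reproduce the whole-number column of the example; the exact values from (17.3) differ from them only slightly for n=26.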

c. In non-random text of N elements, where there are n possible different elements with 
the respective probabilities of occurrence $p_1, p_2, \ldots, p_n$, the average number of elements 
occurring once each is given by 

(17.7) $N \displaystyle\sum_{i=1}^{n} p_i (1 - p_i)^{N-1}$;

the average number of elements occurring twice each is given by 

(17.8) $\dfrac{N(N-1)}{2!} \displaystyle\sum_{i=1}^{n} p_i^2 (1 - p_i)^{N-2}$;

the average number of elements occurring r times each is given by 

(17.9) $\dfrac{N(N-1) \cdots (N-r+1)}{r!} \displaystyle\sum_{i=1}^{n} p_i^r (1 - p_i)^{N-r}$.

The formulas (17.7), (17.8), and (17.9) may be respectively approximated by 

(17.10) $\displaystyle\sum_{i=1}^{n} (Np_i)\,e^{-Np_i}$

(17.11) $\displaystyle\sum_{i=1}^{n} (1/2!)\,(Np_i)^2\,e^{-Np_i}$

(17.12) $\displaystyle\sum_{i=1}^{n} (1/r!)\,(Np_i)^r\,e^{-Np_i}$

The formulas in (17.10), (17.11), and (17.12) may also be evaluated by means of the tables 
for the Poisson exponential. 

d. Charts giving the number of letters occurring r times each, for various values of N have 
not been prepared since these variations are to a large extent taken into account in formulas 
to be discussed now. 

18. The φ test for non-random character of text. — a. It is to be expected that the variation in 
the number of occurrences of the n possible elements of a text of N elements would be greater 
for non-random text than for random text. Some measure of this variation is desirable as a 
quantitative test as to whether or not the text of a cryptogram has been properly arranged into 
its simplest component elements. 

b. Consider a text of N elements in a system where there are n possible elements. Let us 
suppose that there are $f_1, f_2, \ldots, f_n$ respectively of each of the different possible elements in the 
text so that $f_1 + f_2 + \cdots + f_n = N$. 

If we set $\phi = f_1(f_1 - 1) + f_2(f_2 - 1) + \cdots + f_n(f_n - 1)$ then it is possible to show that 

(18.1) $E(\phi) = s_2 N(N-1)$

where E( ) means the average or expected value of the expression in the parenthesis, and $s_2$ is 
the sum of the squares of the probabilities of occurrence of each of the n possible elements in the 
system. (The definition of φ is not as arbitrary as may first appear, but is related to a most 
important concept, that of coincidences, which is discussed in section VII. In paragraph 25b 
of that section is given a proof of (18.1).) 

For monoalphabetic and digraphic distributions (18.1) yields the results shown below: 

                                   E(φ)
                        Monoalphabetic text    Digraphic text

English                 0.0661 N(N-1)          0.0069 N(N-1)
French                   .0778 N(N-1)           .0093 N(N-1)
German                   .0762 N(N-1)           .0112 N(N-1)
Italian                  .0738 N(N-1)           .0081 N(N-1)
Japanese (Romaji)        .0819 N(N-1)           .0116 N(N-1)
Portuguese               .0791 N(N-1)
Russian                  .0529 N(N-1)           .0058 N(N-1)
Spanish                  .0775 N(N-1)           .0093 N(N-1)


For random text, $s_2 = 1/n$, so that (18.1) yields the results shown below: 

                         E(φ), Random Text
Monographic              Digraphic               Trigraphic

0.038 N(N-1)             0.0015 N(N-1)           0.000057 N(N-1)

Example 14 .— Does the following represent a selection of English text enciphered mono- 
alphabetically? 



IBMQO PBIUO MBBGA JCZOF MUUQB 



A uniliteral distribution of the text yields the following: 

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
1 5 1 0 0 1 1 0 2 1 0 0 3 0 3 1 2 0 0 0 3 0 0 0 0 1

For this case the observed value of φ is 1×0+5×4+1×0+1×0+1×0+2×1+1×0+3×2+ 
3×2+1×0+2×1+3×2+1×0=42. For monoalphabetic text in English the expected value 
is 0.066×25×24=39.6; for random text the expected value is 0.038×25×24=22.8. One must 
conclude that the cipher text is the result of a monoalphabetic substitution, since the observed 
value of φ (42) more closely approximates the expected value for English plain-text (39.6) than 
it does the expected value for random text (22.8). 

Example 15 . — Does the following represent a selection of English text enciphered mono- 
alphabetically? 

HKWZA RRPBQ BIVYS MPDMQ MVUDC 



A uniliteral distribution of the text yields the following: 

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
1 2 1 2 0 0 0 1 1 0 1 0 3 0 0 2 2 2 1 0 1 2 1 0 1 1

For this case the observed value of φ is 1×0+2×1+1×0+2×1+1×0+1×0+1×0+3×2+ 
2×1+2×1+2×1+1×0+1×0+2×1+1×0+1×0+1×0=18. As in example 14, the 
expected values for monoalphabetic and random text are 39.6 and 22.8 respectively. One must 
conclude that the text is not monoalphabetic, since the observed value of φ (18) lies much nearer the expected value for random text. 
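The computations of examples 14 and 15 can be sketched as follows. The function names `phi` and `expected_phi` are illustrative; the constants 0.066 and 0.038 are the rounded values of $s_2$ used in the examples for English monoalphabetic text and for random text.

```python
from collections import Counter


def phi(text: str) -> int:
    """Observed phi: the sum of f(f - 1) over the frequency f of each letter."""
    counts = Counter(c for c in text.upper() if c.isalpha())
    return sum(f * (f - 1) for f in counts.values())


def expected_phi(N: int, s2: float) -> float:
    """(18.1): expected phi for N elements, s2 being the sum of squared probabilities."""
    return s2 * N * (N - 1)


example_14 = "IBMQO PBIUO MBBGA JCZOF MUUQB"
example_15 = "HKWZA RRPBQ BIVYS MPDMQ MVUDC"

for name, text in (("example 14", example_14), ("example 15", example_15)):
    N = sum(c.isalpha() for c in text)
    print(
        f"{name}: phi={phi(text)}  "
        f"expected plain={expected_phi(N, 0.066):.1f}  "
        f"expected random={expected_phi(N, 0.038):.1f}"
    )
```

With only 25 letters the comparison is a rough guide rather than a proof; the variance of φ discussed in paragraph 18d indicates how much scatter about the expected value is to be allowed for.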

For convenience we shall refer to the test described above as the φ (Phi) test. 

c. From (18.1), there may be derived after some simple manipulation a formula for the 
expected value of the sum of the squares of the number of occurrences of each element. If we 
set $\psi = f_1^2 + f_2^2 + \cdots + f_n^2$, then 

(18.2) $E(\psi) = s_2 N^2 + (1 - s_2) N$

The values of $s_2$ for monoalphabetic and digraphic text, for various languages, are shown herewith: 

                        Monoalphabetic     Digraphic

English                   0.0661            0.0069
French                     .0778             .0093
German                     .0762             .0112
Italian                    .0738             .0081
Japanese (Romaji)          .0819             .0116
Portuguese                 .0791
Russian                    .0529             .0058
Spanish                    .0775             .0093


d. An idea of the variation of the observed values of $\phi = f_1(f_1 - 1) + f_2(f_2 - 1) + \cdots + f_n(f_n - 1)$ 
about its expected value is indicated by the variance, which is 

(18.3) $\sigma_\phi^2 = 4N^3(s_3 - s_2^2) + 2N^2(5s_2^2 + s_2 - 6s_3) + 2N(4s_3 - s_2 - 3s_2^2)$




